Memory lower bounds for XPath evaluation over XML streams

Author

  • Prakash Ramanan
Abstract

We consider the XPath evaluation problem: evaluate an XPath query Q on a streaming XML document D. We consider two versions of the problem: (1) the Filtering Problem: determine whether there is a match for Q in D; (2) the Node Selection Problem: determine the set Q(D) of document nodes selected by Q. We consider Conjunctive XPath (CXPath) queries that involve only the child and descendant axes. Let d denote the depth of D, and let n denote the number of location steps in Q. Bar-Yossef et al. presented lower bounds on the memory space required by any algorithm to solve these two problems. Their lower bounds apply to each query in a large subset of XPath, and are obtained (mostly) using nonrecursive (Q,D). In this paper, we present larger lower bounds for a different class of queries (namely, CXPath queries with independent predicates), on recursive (Q,D). One of our results is an Ω(n · maxcands(Q,D)) lower bound for the node selection problem, for a worst-case Q; maxcands(Q,D) is the maximum number of nodes of D that can be candidates for output at any one instant. So, there is no algorithm for the node selection problem that uses O(f(d, |Q|) + maxcands(Q,D)) space, for any function f. This shows that some previously published algorithms are incorrect.
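To make the notion of output candidates concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's algorithm or construction: the query /a[b]/c, the function name select_c_under_a_with_b, and the sample documents are all assumptions chosen for this example. Each c child of the root a is a candidate for output until a b child is seen (or the document ends), so the evaluator has to buffer it; the peak size of that buffer plays the role of maxcands(Q,D) for this one query.

```python
import io
import xml.etree.ElementTree as ET


def select_c_under_a_with_b(xml_bytes):
    """Evaluate the hypothetical query /a[b]/c over start/end events.

    Returns (selected, max_candidates): the 1-based positions (among the
    c children of the root) of the selected nodes, and the peak number
    of candidates buffered at any one instant.
    Note: iterparse is used only as an event source; this sketch makes no
    attempt to bound the parser's own memory use.
    """
    depth = 0
    root_is_a = False
    seen_b = False          # has the predicate [b] been satisfied yet?
    candidates = []         # c nodes whose selection is still undecided
    selected = []
    max_candidates = 0
    c_index = 0

    stream = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in stream:
        if event == "end":
            depth -= 1
            continue
        depth += 1
        if depth == 1:
            root_is_a = elem.tag == "a"
        elif depth == 2 and root_is_a:
            if elem.tag == "b" and not seen_b:
                # Predicate resolved: every buffered candidate is output.
                seen_b = True
                selected.extend(candidates)
                candidates.clear()
            elif elem.tag == "c":
                c_index += 1
                if seen_b:
                    selected.append(c_index)
                else:
                    candidates.append(c_index)
                    max_candidates = max(max_candidates, len(candidates))
    # Candidates still buffered at end of stream are discarded.
    return selected, max_candidates


late_b = select_c_under_a_with_b(b"<a><c/><c/><c/><b/><c/></a>")
no_b = select_c_under_a_with_b(b"<a><c/><c/></a>")
early_b = select_c_under_a_with_b(b"<a><b/><c/></a>")
```

For this particular query, the filtering problem needs only two flags (whether a b child and a c child of the root have been seen), whereas node selection must hold on to the undecided c nodes until the predicate is resolved.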

Similar resources

Worst-case optimal algorithm for XPath evaluation over XML streams

We consider the XPath evaluation problem: Evaluate an XPath query Q on a streaming XML document D; i.e., determine the set Q(D) of document elements selected by Q. We mainly consider Conjunctive XPath queries that involve only the child and descendant axes. Previously known in-memory algorithms for this problem use O(|D|) space and O(|Q||D|) time. Several previously known algorithms for the stream...

FluXQuery: An Optimizing XQuery Processor for Streaming XML Data

XML has established itself as the ubiquitous format for data exchange on the Internet. An imminent development is that of streams of XML data being exchanged and queried. Data management scenarios where XQuery [11] is evaluated on XML streams are becoming increasingly important and realistic, e.g. in e-commerce settings. Naturally, query engines employed for stream processing are main-memory-ba...

Towards a Streamed XPath Evaluation

XPath is a language for addressing fragments of XML documents, used in query and transformation languages such as XQuery and XSLT. For many applications it is desirable to process XPath on the fly and progressively against data streams. This diploma thesis is devoted to streamed and progressive evaluation of XPath. A streamed and progressive XPath evaluation considerably reduces the needed memo...

Evaluating XPath Queries on XML Data Streams

Whenever queries have to be evaluated on XML data streams, or when the memory available to evaluate the XML data is relatively small compared to the document, DOM-based approaches that have to load and store large parts of the document in main memory will fail. In comparison, we present an approach to evaluate XPath queries on SAX streams that supports all axes of core XPath, including th...

The Space Complexity of Processing XML Twig Queries Over Indexed Documents

Current twig join algorithms incur high memory costs on queries that involve child-axis nodes. In this paper we provide an analytical explanation for this phenomenon. In a first large-scale study of the space complexity of evaluating XPath queries over indexed XML documents we show the space to depend on three factors: (1) whether the query is a path or a tree; (2) the types of axes occurring i...

Journal title:
  • J. Comput. Syst. Sci.

Volume 77  Issue 

Pages  -

Publication date 2011